On the effectiveness of feature set augmentation using clusters of word embeddings

Authors

  • Georgios Balikas
  • Ioannis Partalas
  • Massih-Reza Amini
Abstract

Word clusters have been empirically shown to offer important performance improvements on various tasks. Despite their importance, their incorporation in the standard pipeline of feature engineering relies more on a trial-and-error procedure where one evaluates several hyper-parameters, like the number of clusters to be used. In order to better understand the role of such features, we systematically evaluate their effect on four tasks: named entity segmentation and classification, as well as five-point sentiment classification and quantification. Our results strongly suggest that cluster membership features improve the performance.
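
The idea of cluster membership features can be illustrated with a minimal sketch. It assumes k-means as the clustering algorithm (the abstract does not name one), uses made-up two-dimensional toy embeddings, and introduces the hypothetical helpers `kmeans` and `cluster_features`; none of these come from the paper. The number of clusters `k` is the hyper-parameter the abstract refers to.

```python
import random


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain Lloyd's k-means over a list of equal-length float lists."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct input vectors.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assign every vector to its nearest centroid (squared Euclidean distance).
        for i, v in enumerate(vectors):
            assignment[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])),
            )
        # Move each centroid to the mean of its members; keep it if the cluster is empty.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


# Toy embeddings, invented for illustration only.
embeddings = {
    "paris": [0.9, 0.1],
    "london": [0.8, 0.2],
    "berlin": [0.85, 0.15],
    "good": [0.1, 0.9],
    "great": [0.2, 0.8],
    "awful": [0.15, 0.95],
}

words = list(embeddings)
labels = kmeans([embeddings[w] for w in words], k=2)
word_to_cluster = dict(zip(words, labels))


def cluster_features(tokens, word_to_cluster, k):
    """Count how many tokens fall in each cluster; unknown tokens are ignored."""
    feats = [0] * k
    for token in tokens:
        cluster = word_to_cluster.get(token.lower())
        if cluster is not None:
            feats[cluster] += 1
    return feats


# The resulting vector can be appended to an existing feature set.
features = cluster_features(["Paris", "is", "great"], word_to_cluster, k=2)
```

In this sketch, each text is represented by a length-`k` count vector, so changing `k` changes both the granularity of the clusters and the size of the added feature block.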

Similar resources

Are Word Embedding-based Features Useful for Sarcasm Detection?

This paper makes a simple increment to the state of the art in sarcasm detection research. Existing approaches are unable to capture subtle forms of context incongruity, which lies at the heart of sarcasm. We explore if prior work can be enhanced using semantic similarity/discordance between word embeddings. We augment word embedding-based features to four feature sets reported in the past. We also e...

Optimum Ensemble Classification for Fully Polarimetric SAR Data Using Global-Local Classification Approach

In this paper, a proposed ensemble classification for fully polarimetric synthetic aperture radar (PolSAR) data using a global-local classification approach is presented. In the first step, to perform the global classification, the training feature space is divided into a specified number of clusters. In the next step to carry out the local classification over each of these clusters, which cont...

Intellectual structure of knowledge in Nanomedicine field (2009 to 2018): A Co-Word Analysis

Introduction: The Co-word analysis has the ability to identify the intellectual structure of knowledge in a research domain and reveal its subsurface research aspects. Objective: This study examines the intellectual structure of knowledge in the field of nanomedicine during the period of 2009 to 2018 by using Co-word analysis. Materials and Methods: This paper develops a sciento...

A Correlated Topic Model Using Word Embeddings

Conventional correlated topic models are able to capture correlation structure among latent topics by replacing the Dirichlet prior with the logistic normal distribution. Word embeddings have been proven to be able to capture semantic regularities in language. Therefore, the semantic relatedness and correlations between words can be directly calculated in the word embedding space, for example, ...

Modeling the Phonological Recognition of Persian Words

A model of spoken word recognition is proposed. This model is particularly concerned with extraction of cues from the signal leading to a specification of a word in terms of bundles of distinctive features, which are assumed to be the building blocks of words. In the model proposed, auditory input is chunked into a set of successive time slices. It is assumed that the derivation of the underly...

Journal:
  • CoRR

Volume: abs/1705.01265

Publication date: 2017